Evaluating the quality of published medical research with ChatGPT

Thelwall, Mike, Jiang, Xiaorui, Bath, Peter A.

arXiv.org Artificial Intelligence

Research quality evaluation is important for departmental evaluations and academic career decisions. Unfortunately, the evaluators may not have time to fully read the work assessed and may instead rely on the reputation or Journal Impact Factor of the publishing journals, on the citation counts for individual articles, or on the reputation or career citations of the author. Whilst journal-based evidence is not optimal (Waltman & Traag, 2021), the main article-level indicator, citation counts, only directly reflects the scholarly impact of work and not its rigour, originality, and societal impacts (Aksnes, et al., 2019), all of which are relevant quality dimensions (Langfeldt et al., 2020). Moreover, article citation counts are ineffective for newer articles (Wang, 2013). In response, attempts to use Large Language Models (LLMs) to evaluate the quality of academic work have shown that ChatGPT quality scores are at least as effective as citation counts in most fields and substantially better in a few (Thelwall & Yaghi, 2024). Medicine is an exception, however, with ChatGPT research quality scores having a small negative correlation with the mean scores of the submitting department in the Research Excellence Framework (REF) Clinical Medicine Unit of Assessment (UoA) (Thelwall, 2024a,b; Thelwall & Yaghi, 2024).
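The comparison underlying this line of work, how well LLM quality scores versus citation counts track expert review scores, can be sketched with a rank correlation. The sketch below uses Spearman's rho in pure Python; all scores and citation counts are invented for illustration and do not come from the study.

```python
# Sketch (invented data): comparing how well ChatGPT quality scores and
# citation counts track expert review scores, via Spearman rank correlation.

def rank(values):
    """Average 1-based ranks, with ties sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho = Pearson correlation of the two rank vectors."""
    rx, ry = rank(x), rank(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

expert = [4, 3, 3, 2, 1, 4, 2, 3]                     # hypothetical REF-style scores
chatgpt = [3.8, 3.1, 2.9, 2.2, 1.5, 3.6, 2.4, 3.0]    # hypothetical LLM scores
citations = [120, 15, 200, 8, 40, 60, 5, 90]          # hypothetical citation counts

print(round(spearman(expert, chatgpt), 2))    # → 0.96
print(round(spearman(expert, citations), 2))  # → 0.61
```

With these invented numbers the LLM scores correlate more strongly with the expert scores than citation counts do, mirroring the pattern the abstract reports for most fields (though not, per the abstract, for clinical medicine).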


Assessing the societal influence of academic research with ChatGPT: Impact case study evaluations

Kousha, Kayvan, Thelwall, Mike

arXiv.org Artificial Intelligence

Academics and departments are sometimes judged by how their research has benefitted society. For example, the UK Research Excellence Framework (REF) assesses Impact Case Studies (ICS), which are five-page evidence-based claims of societal impacts. This study investigates whether ChatGPT can evaluate societal impact claims and therefore potentially support expert human assessors. For this, various parts of 6,220 public ICS from REF2021 were fed to ChatGPT 4o-mini along with the REF2021 evaluation guidelines, comparing the results with published departmental average ICS scores. The results suggest that the optimal strategy for high correlations with expert scores is to input the title and summary of an ICS but not the remaining text, and to modify the original REF guidelines to encourage a stricter evaluation. The scores generated by this approach correlated positively with departmental average scores in all 34 Units of Assessment (UoAs), with values between 0.18 (Economics and Econometrics) and 0.56 (Psychology, Psychiatry and Neuroscience). At the departmental level, the corresponding correlations were higher, reaching 0.71 for Sport and Exercise Sciences, Leisure and Tourism. Thus, ChatGPT-based ICS evaluations are a simple and viable way to support or cross-check expert judgments, although their value varies substantially between fields.
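The input-selection strategy the abstract describes, scoring an ICS from its title and summary only, under stricter guidance, can be sketched as prompt construction plus score parsing. The guideline wording, example ICS, and `call_llm` stub below are illustrative assumptions, not the authors' actual prompt or the real ChatGPT 4o-mini API.

```python
# Sketch of the title-and-summary-only scoring strategy. The prompt text is
# an invented stand-in for the modified REF guidelines; call_llm() is a stub
# in place of a real ChatGPT 4o-mini request.
import re

STRICT_GUIDELINES = (
    "Score the societal impact claim from 1 (limited) to 4 (outstanding). "
    "Be strict: reserve 4 for exceptional, well-evidenced impact."
)

def build_prompt(title: str, summary: str) -> str:
    # Deliberately omit the rest of the ICS text: the study found that the
    # title and summary alone correlated best with expert scores.
    return f"{STRICT_GUIDELINES}\n\nTitle: {title}\nSummary: {summary}\n\nScore:"

def parse_score(reply: str) -> int:
    match = re.search(r"[1-4]", reply)
    if not match:
        raise ValueError(f"no score found in reply: {reply!r}")
    return int(match.group())

def call_llm(prompt: str) -> str:
    return "Score: 3 (significant, well-evidenced regional impact)"  # stub reply

score = parse_score(call_llm(build_prompt(
    "Improving flood forecasting for coastal towns",      # invented ICS title
    "Our models were adopted by two national agencies.",  # invented summary
)))
print(score)  # → 3
```

Per-ICS scores produced this way would then be averaged per department and correlated with the published departmental scores, as in the previous sketch.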


Can ChatGPT Forecast Stock Price Movements? Return Predictability and Large Language Models

Lopez-Lira, Alejandro, Tang, Yuehua

arXiv.org Artificial Intelligence

We examine the potential of ChatGPT and other large language models in predicting stock market returns using news headlines. We use ChatGPT to assess whether each headline is good, bad, or neutral for firms' stock prices. We document a significantly positive correlation between ChatGPT scores and subsequent daily stock returns. We find that ChatGPT outperforms traditional sentiment analysis methods. More basic models such as GPT-1, GPT-2, and BERT cannot accurately forecast returns, indicating return predictability is an emerging capacity of complex language models. Long-short strategies based on ChatGPT-4 deliver the highest Sharpe ratio. Furthermore, we find predictability in both small and large stocks, suggesting market underreaction to company news. Predictability is stronger among smaller stocks and stocks with bad news, consistent with limits-to-arbitrage also playing an important role. Finally, we propose a new method to evaluate and understand the models' reasoning capabilities. Overall, our results suggest that incorporating advanced language models into the investment decision-making process can yield more accurate predictions and enhance the performance of quantitative trading strategies.
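The trading logic summarized above, labeling each headline good, bad, or neutral and forming an equal-weighted long-short portfolio, can be sketched in a few lines. The labels below are hard-coded stand-ins for ChatGPT output, and the tickers and returns are invented; this is not the paper's implementation.

```python
# Sketch (invented data): map each firm's ChatGPT headline label to a score
# (good=+1, neutral=0, bad=-1), go long positive-score stocks and short
# negative-score stocks, equal-weighted, and compute the next day's
# long-short return.

def long_short_return(labels, next_day_returns):
    score = {"good": 1, "neutral": 0, "bad": -1}
    longs = [t for t, lab in labels.items() if score[lab] > 0]
    shorts = [t for t, lab in labels.items() if score[lab] < 0]
    long_ret = sum(next_day_returns[t] for t in longs) / len(longs) if longs else 0.0
    short_ret = sum(next_day_returns[t] for t in shorts) / len(shorts) if shorts else 0.0
    return long_ret - short_ret

labels = {"AAA": "good", "BBB": "bad", "CCC": "neutral", "DDD": "good"}  # stub LLM labels
rets = {"AAA": 0.012, "BBB": -0.020, "CCC": 0.001, "DDD": 0.004}         # invented returns

print(round(long_short_return(labels, rets), 4))  # → 0.028
```

Repeating this daily and dividing the mean long-short return by its standard deviation would give the Sharpe-ratio comparison the abstract reports across models.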


Can ChatGPT pass the Vietnamese National High School Graduation Examination?

Dao, Xuan-Quy, Le, Ngoc-Bich, Phan, Xuan-Dung, Ngo, Bac-Bien

arXiv.org Artificial Intelligence

This research article highlights the potential of AI-powered chatbots in education and presents the results of using ChatGPT, a large language model, to complete the Vietnamese National High School Graduation Examination (VNHSGE). The study dataset included 30 essays in the literature test case and 1,700 multiple-choice questions designed for other subjects. The results showed that ChatGPT was able to pass the examination with an average score of 6-7, demonstrating the technology's potential to revolutionize the educational landscape. The analysis of ChatGPT's performance revealed its proficiency in a range of subjects, including mathematics, English, physics, chemistry, biology, history, geography, civic education, and literature, which suggests its potential to provide effective support for learners. However, further research is needed to assess ChatGPT's performance on more complex exam questions and its potential to support learners in different contexts. As technology continues to evolve and improve, we can expect to see the use of AI tools like ChatGPT become increasingly common in educational settings, ultimately enhancing the educational experience for both students and educators.
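The multiple-choice portion of an evaluation like this reduces to comparing model answers against an answer key on the exam's 10-point scale. The answer key and model answers below are invented; the scoring rule (fraction correct times 10) is a simplifying assumption, not necessarily the official VNHSGE marking scheme.

```python
# Hypothetical MCQ scoring sketch on a 10-point scale. Both the answer key
# and the model's answers are invented for illustration.

def mcq_score(answers, key, scale=10.0):
    correct = sum(a == k for a, k in zip(answers, key))
    return scale * correct / len(key)

key     = ["A", "C", "B", "D", "A", "B", "C", "D", "A", "B"]  # invented key
chatgpt = ["A", "C", "B", "A", "A", "B", "C", "C", "A", "D"]  # invented answers

print(mcq_score(chatgpt, key))  # 7 of 10 correct → 7.0
```

Averaging such per-subject scores across the nine subjects would yield the 6-7 overall figure the abstract reports.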